181 research outputs found
Efficient Generator of Mathematical Expressions for Symbolic Regression
We propose an approach to symbolic regression based on a novel variational
autoencoder for generating hierarchical structures, HVAE. It combines simple
atomic units with shared weights to recursively encode and decode the
individual nodes in the hierarchy. Encoding is performed bottom-up and decoding
top-down. We empirically show that HVAE can be trained efficiently with small
corpora of mathematical expressions and can accurately encode expressions into
a smooth low-dimensional latent space. The latter can be efficiently explored
with various optimization methods to address the task of symbolic regression.
Indeed, random search through the latent space of HVAE performs better than
random search through expressions generated by manually crafted probabilistic
grammars for mathematical expressions. Finally, the EDHiE system for symbolic
regression, which applies an evolutionary algorithm to the latent space of
HVAE, reconstructs equations from a standard symbolic regression benchmark
better than a state-of-the-art system based on a similar combination of deep
learning and evolutionary algorithms.
Comment: 35 pages, 11 tables, 7 multi-part figures, Machine Learning
(Springer) and journal track of ECML/PKDD 202
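The bottom-up encoding with shared weights described in the abstract above can be illustrated with a minimal sketch. This is not the HVAE architecture itself: the toy vocabulary `SYMBOLS`, the code size `DIM`, the three weight matrices and the `tanh` combination are all assumptions made for illustration, and the variational part and the top-down decoder are omitted. It only shows the recursive pattern in which the same small unit is applied at every node, children first.

```python
import math
import random


class Node:
    """One node of a binary expression tree, e.g. '+' with two children."""

    def __init__(self, symbol, left=None, right=None):
        self.symbol = symbol
        self.left = left
        self.right = right


SYMBOLS = ["+", "*", "x", "c"]  # hypothetical toy vocabulary
DIM = 4                         # hypothetical size of each node code

rng = random.Random(0)          # fixed seed so the sketch is reproducible


def rand_matrix(rows, cols):
    """Small random matrix standing in for trained weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Shared weights: the same three matrices are reused at every node.
W_SYM = rand_matrix(DIM, len(SYMBOLS))
W_LEFT = rand_matrix(DIM, DIM)
W_RIGHT = rand_matrix(DIM, DIM)


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def encode(node):
    """Bottom-up encoding: children are encoded first, then combined at the parent."""
    if node is None:
        return [0.0] * DIM  # a missing child contributes a zero code
    left_code = encode(node.left)
    right_code = encode(node.right)
    one_hot = [1.0 if s == node.symbol else 0.0 for s in SYMBOLS]
    parts = (
        matvec(W_SYM, one_hot),
        matvec(W_LEFT, left_code),
        matvec(W_RIGHT, right_code),
    )
    return [math.tanh(a + b + c) for a, b, c in zip(*parts)]


# The expression x + c * x as a tree.
expr = Node("+", Node("x"), Node("*", Node("c"), Node("x")))
code = encode(expr)  # fixed-length code for the whole expression
```

Because the weights are shared across nodes, the number of parameters does not grow with the size of the expression, which is consistent with the abstract's claim that training works with small corpora.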
Windthrow factors: a case study of windthrow on Pokljuka
This paper presents a case study of windthrow. The study area comprised two forest gaps covering 1.7 ha on the Pokljuka plateau, Slovenia, where strong wind had blown down 44 trees. The 44 standing trees closest to the fallen trees served as a control group. For the fallen trees, we measured diameter at breast height, tree height, crown diameter and crown height, and the number and diameter of roots; we also calculated the volume of the root system and took an increment core to check for root rot. For the standing trees, we measured diameter at breast height, tree height, crown diameter and crown height, and the number and diameter of roots. In addition to statistical processing, the data were analysed with machine learning methods in the Weka software. The most important windthrow factors in the study area were storm wind (speed above 17 m/s), wet and shallow soil, and the edges of the forest gaps. The results also show that diameter at breast height, tree height and the presence of root rot contributed significantly to windthrow.
Semi-supervised Predictive Clustering Trees for (Hierarchical) Multi-label Classification
Semi-supervised learning (SSL) is a common approach to learning predictive
models using not only labeled examples, but also unlabeled examples. While SSL
for the simple tasks of classification and regression has received a lot of
attention from the research community, it has not been properly investigated for
complex prediction tasks with structurally dependent variables. This is the
case for multi-label classification and hierarchical multi-label classification
tasks, which may require additional information, possibly coming from the
underlying distribution in the descriptive space provided by unlabeled
examples, to better address the challenging task of simultaneously predicting
multiple class labels.
In this paper, we investigate this aspect and propose a (hierarchical)
multi-label classification method based on semi-supervised learning of
predictive clustering trees. We also extend the method towards ensemble
learning and propose a method based on the random forest approach. Extensive
experimental evaluation conducted on 23 datasets shows significant advantages
of the proposed method and its extension with respect to their supervised
counterparts. Moreover, the method preserves interpretability and reduces the
time complexity of classical tree-based models.
- …